Index-Based Solutions for Efficient Density Peak Clustering

Abstract

Density Peak Clustering (DPC), a popular density-based clustering approach, has received considerable attention from the research community, primarily due to its simplicity and its small number of parameters. However, the resultant clusters obtained using DPC are influenced by the sensitive parameter $d_c$, which depends on the data distribution and the requirements of different users. Besides, the original algorithm requires visiting a large number of objects, making it slow. To this end, this paper investigates index-based solutions for DPC. Specifically, we propose two list-based index methods, viz. (i) a simple List Index and (ii) an advanced Cumulative Histogram Index. Efficient query algorithms proposed for these indices significantly avoid irrelevant comparisons at the cost of space. For memory-constrained systems, we further introduce an approximate solution to the above indices that allows a substantial reduction in space cost, provided that slight inaccuracies are admissible. Furthermore, owing to the considerably lower memory requirements of existing tree-based index structures, we also present effective pruning techniques and efficient query algorithms to support DPC using the Quadtree Index and the R-tree Index. Finally, we practically evaluate all the above indices and present the findings and results, obtained from a set of extensive experiments on six synthetic and real datasets. The experimental insights can help guide the selection of a befitting index.
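To make the role of $d_c$ and the cost of the unindexed algorithm concrete, here is a minimal sketch of the two quantities that standard DPC computes for every object: the local density (number of neighbours closer than $d_c$) and the distance to the nearest object of higher density. This is the naive all-pairs baseline, not any of the index structures proposed in the paper. The function name `dpc_quantities`, the cutoff-kernel density and the Euclidean metric are assumptions chosen for illustration.

```python
import numpy as np

def dpc_quantities(points, d_c):
    """Naive O(n^2) DPC baseline (illustrative sketch, not the paper's index).

    Returns, for each point:
      rho            - number of other points strictly closer than d_c
      delta          - distance to the nearest point with higher rho
      nearest_denser - index of that point, or -1 if none exists
    Assumes d_c > 0 and Euclidean distance.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # All pairwise distances: this is the part an index is meant to avoid.
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    # Each point is within d_c of itself, so subtract 1.
    rho = (dist < d_c).sum(axis=1) - 1
    delta = np.empty(n)
    nearest_denser = np.full(n, -1)
    for i in range(n):
        denser = np.flatnonzero(rho > rho[i])
        if denser.size == 0:
            # Convention for a densest point: use its largest distance.
            delta[i] = dist[i].max()
        else:
            j = denser[np.argmin(dist[i, denser])]
            delta[i] = dist[i, j]
            nearest_denser[i] = j
    return rho, delta, nearest_denser

if __name__ == "__main__":
    sample = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11]]
    rho, delta, nearest = dpc_quantities(sample, d_c=1.5)
    print(rho, delta, nearest)
```

Changing `d_c` changes every `rho` value and therefore every `delta`, so in this baseline each new choice of `d_c` repeats the full pairwise scan. That repeated cost is the motivation the abstract gives for precomputed indices.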

Similar resources

DenPEHC: Density peak based efficient hierarchical clustering

Existing hierarchical clustering algorithms involve a flat clustering component and an additional agglomerative or divisive procedure. This paper presents a density peak based hierarchical clustering method (DenPEHC), which directly generates clusters on each possible clustering layer, and introduces a grid granulation framework to enable DenPEHC to cluster large-scale and high-dimensional (LSH...

Efficient Anytime Density-based Clustering

Many clustering algorithms suffer from scalability problems on massive datasets and do not support any user interaction during runtime. To tackle these problems, anytime clustering algorithms are proposed. They produce a fast approximate result which is continuously refined during the further run. Also, they can be stopped or suspended anytime and provide an answer. In this paper, we propose a ...

An Efficient Density-based Clustering Algorithm for Higher-Dimensional Data

DBSCAN is a commonly used clustering algorithm due to its ability to find arbitrarily shaped clusters and its robustness to outliers. Generally, the complexity of DBSCAN is $O(n^2)$ in the worst case, and in practice it becomes more severe in higher dimensions. Grid-based DBSCAN is one of the recent improved algorithms aiming at improving efficiency. However, the performance of grid-based...

Efficient Join-Index-Based Spatial-Join Processing: A Clustering Approach

A join-index is a data structure used for processing join queries in databases. Join-indices use precomputation techniques to speed up online query processing and are useful for data sets which are updated infrequently. The I/O cost of join computation using a join-index with limited buffer space depends primarily on the page-access sequence used to fetch the pages of the base relations. Given ...

Efficient Join-Index-Based Join Processing: A Clustering Approach

A Join Index is a data structure used for processing join queries in databases. Join indices use pre-computation techniques to speed up online query processing and are useful for data-sets which are updated infrequently. The cost of join computation using a join-index with limited buffer space depends primarily on the page-access sequence used to fetch the pages of the base relations. Given the ...

Journal

Journal title: IEEE Transactions on Knowledge and Data Engineering

Year: 2022

ISSN: 1558-2191, 1041-4347, 2326-3865

DOI: https://doi.org/10.1109/tkde.2020.3004221